The main data used in this tutorial and in the lecture are about the geolocalisation of french restaurants in Paris and in a department called Haute-Garonne. We use two different sources:

  1. an official register called SIRENE (Computer system for the business and establishment register) managed by the French National Institute of Statistics and Economic Studies (Insee) and geolocated by Etalab (French task force for Open Data). This register records the civil status of all companies and their establishments (including restaurants).

SIRENE has the advantages of being rigorous and exhaustive on the French territory.

  1. the famous global open access mapping project OpenStreetMap (OSM). We can access to OSM database using the osmdata R package.

OSM has many benefits, ensuring transparent data provenance and ownership, enabling real-time evolution of the database and, by allowing anyone to contribute, encouraging democratic decision making and citizen science.

Exercise 1 : Manipulate sf objects and associated data.frames

1

  1. Load in R the iris1 map layer ‘iris_31.rds’ of the french department called Haute-Garonne (numbered 31). Would the sf::st_read function also work? Why?
Use the readRDS function.
st_read would not work because ‘iris_31.rds’ is not a shapefile but a file already R formatted. Simply load it with the readRDS function.

2

Display the basemap of department 31 with plot(iris_31). What do you notice ?
We notice that R performs 3 graphs: one graph per variable in the sf object.

3

What is the functionality of the sf::st_geometry function? What solution do you propose then?
sf::st_geometry makes it possible to isolate the information contained in the ‘geometry’ column of the sf object. Using it, we put aside other variables (here CODE_IRIS, P14_POP and INSEE_COM).

4

In which projection is the map layer? Map it with another projection.

Information

Test the Azimuthal Equidistant projection with “crs="+proj=aeqd +lat_0=90 +lon_0=0” to see a clear difference and create a layer called ‘iris_31_aeqd’.
Use the sf::st_crs function to guess the projection and sf::st_transform to change the projection.

5

Calculate the distance matrix between the 5 first iris of department 31. Do you get the same distance matrix if you are working on a layer projected in another projection?

Information

Use map layers called ‘iris_31’ and ‘iris_31_aeqd’.
Units: m
         [,1]      [,2]     [,3]      [,4]     [,5]
[1,]     0.00 51153.341 56509.22 53137.481 13756.98
[2,] 51153.34     0.000 20957.21  3086.011 49828.91
[3,] 56509.22 20957.212     0.00 11077.630 61863.18
[4,] 53137.48  3086.011 11077.63     0.000 54345.46
[5,] 13756.98 49828.910 61863.18 54345.458     0.00
Units: m
         [,1]      [,2]     [,3]      [,4]     [,5]
[1,]     0.00 57206.922 62306.32 59385.205 13798.28
[2,] 57206.92     0.000 20957.48  3204.855 55368.23
[3,] 62306.32 20957.481     0.00 11076.220 66744.65
[4,] 59385.20  3204.855 11076.22     0.000 59843.30
[5,] 13798.28 55368.225 66744.65 59843.302     0.00
[1] FALSE
No, the two matrices are different.

6

Using the layer called ‘iris_31’, create a new aggregated map layer called ‘com_31’ which corresponds to the municipalities of department 31. Also keep in this new layer the information on the population in each municipality.

Information

The map layer called ‘iris_31’ contains the 5 digit codes of municipalities in its variable INSEE_COM and the 2014 population in its column P14_POP.
Use the classic functions of dplyr package: select, group_by et summarize. These functions also work with sf objects.

7

Using the data contained in ‘sir_31’, add to this layer the number of restaurants per municipality.

Information

The code of each municipality is not in the ‘sir_31’ database. To create it, you have to create a variable called INSEE_COM (5 digits) which concatenates the DEPET (2 digits) and COMET (3 digits) variables.

8

Aggregate all the information present in ‘com_31’ at the level of french intercommunalites (called EPCI) and call this new layer ‘epci_31’.

Information

You have to use the database ‘table_MAUP.rds’ to have a match between the municipality code (CODGEO) and intercommunality code (EPCI).

9

Using the cartography package, simply plot a map of french intercommunality with a proportional circle layer related to the number of restaurants.
The propSymbolsLayer function allows you to draw proportional circles.

Exercise 2 : Maps with cartography and ggplot2 packages

We would like here to design EPCI maps that combine the number of restaurants and the number of restaurants per 10,000 inhabitants.

1

Data preparation:

  • Load ‘fra.shp’ in R: the map layer of the borders of the French metropolitan territory.
  • Load the layer called ‘epci_31.rds’ (see previous exercise for more details) and create a variable called nb_rest_10000inhab which corresponds to the number of restaurants per 10,000 inhabitants in each territory.
  • Create a vector of quantiles breaks of the nb_rest_10000inhab variable.
  • Create the vector colors which corresponds to a the number of classes defined earlier.
  • Finally, add a variable called typo to ‘epci_31’ which indicates the class of the territory according to the discretization contained in bks for the nb_rest_10000inhab variable (this is needed for ggplot2 maps).

Information

For the creation of ‘bks’ et ‘cols’, use the getBreaks et carto.pal functions of the cartography package. For the creation of the typo variable, you can use the cut function and apply the parameters digit.lab = 2 and include.lowest = TRUE.

2

With the help of cartography package, make the following map which contains in a choropleth layer the variable nb_rest_10000inhab and in a proportional circle layer the variable nb_of_rest. Do the same thing with the ggplot2 package.

With cartography:

With ggplot2:

cartography

ggplot2

3

What other solution could we use to display these two variable on the same map? Try it using the cartography package.
The propSymbolsChoroLayer function allows you to draw colored proportional circles.

4

Using the cartography package, display the number of restaurants and the number of restaurants per 10,000 inhabitants at the municipalities and EPCI scales. The two maps displayed side by side should be as much comparable as possible.
library(sf)
library(cartography)
library(dplyr)
# Import data
fra <- st_read("../data/fra.shp", quiet = TRUE)
epci_31 <- readRDS("../data/epci_31.rds")
com_31 <- readRDS("../data/com_31.rds")

# Create the variable
epci_31$nb_rest_10000inhab <- 10000 * epci_31$nb_of_rest / epci_31$P14_POP
com_31$nb_rest_10000inhab <- 10000 * com_31$nb_of_rest / com_31$P14_POP

# Define breaks for municipalities (we will use the same breaks for both maps)
bks_com <- getBreaks(v = com_31$nb_rest_10000inhab[com_31$nb_of_rest>0], 
                     method = "quantile", nclass = 6)
# Define color palette
cols <- carto.pal("wine.pal", length(bks_com)-1)

# Define plot margins
par(mar = c(0, 0.1, 1.2, 0.1), bg = "azure", mfrow = c(1,2))
# Find EPCI bounding box
bb <- st_bbox(epci_31)
# Plot France using EPCI boundingbox
plot(st_geometry(fra), col="ivory", border = "ivory3", 
     xlim = bb[c(1, 3)], ylim = bb[c(2, 4)])
# Plot EPCI
plot(st_geometry(epci_31), col="ivory3", border = "ivory2", add=T)
# Plot the choropleth layer
propSymbolsChoroLayer(epci_31, var = "nb_of_rest", var2 = "nb_rest_10000inhab", 
           breaks = bks_com, col = cols, border = "ivory3",lwd = 0.6, 
           legend.var.pos = "bottomright", legend.var2.pos = "n",
           add = TRUE, inches = 0.5,
           legend.var.title.txt = "Number of restaurants")
# Add a layout layer
layoutLayer(title = "Restaurants", sources = "Insee, 2018", 
            author = "Kim & Tim, 2018", 
            theme = "green.pal", col = "darkred", 
            coltitle = "white", postitle = "center",
            frame = FALSE, scale = NULL)


bb <- st_bbox(epci_31)
# Plot France using EPCI boundingbox
plot(st_geometry(fra), col="ivory", border = "ivory3", 
     xlim = bb[c(1, 3)], ylim = bb[c(2, 4)])
# Plot EPCI
plot(st_geometry(com_31), col="ivory3", border = "ivory2",lwd = .5, add=T)
# Plot the choropleth layer
propSymbolsChoroLayer(com_31, var = "nb_of_rest", var2 = "nb_rest_10000inhab", 
           breaks = bks_com, col = cols, border = "ivory3",lwd = 0.6, 
           fixmax = max(epci_31$nb_of_rest),
           legend.var.pos = "n", legend.var2.pos = "bottom",
           add = TRUE, inches = 0.5, legend.var2.values.rnd = 0,
           legend.var2.title.txt = "Number of restaurants\nfor 10,000 inhabitants")
# Add a layout layer
layoutLayer(title = "Restaurants", sources = "", 
            author = "", 
            theme = "green.pal", col = "darkred", 
            coltitle = "white", postitle = "center",
            frame = FALSE, scale = 10)
# Add a north (south) arrow
north(pos = "topright", south = TRUE)

Exercise 3 : Smooth Density analysis of restaurants in Haute-Garonne

1

Load the dataset ‘sir_31’ used previously and map the more than 4,000 restaurants of department 31 with the mapview package. Try using different parameters to customize your map.

Information

For example, you can use the map.types, col.regions, label, color, legend, layer.name, homebutton, lwd … parameters of the mapview function.

2

Load the layer ‘iris_31’ used previously. Apply the pt_in_grid function below on the SIRENE data and on the iris of the department 31 to create a grid called ‘grid’ and count points in it. Use a 2km cellsize.

3

Simply use the plot instruction to plot the grid. Then, using the choroLayer function of the cartography package, plot the number of restaurants per kilometer (called ‘n’) in each box of the grid. What observation can you make concerning the readability of the map?

The map is not very readable because of the choice of the discretization.

4

Plot the distribution of the number of restaurants per box in the grid. What do you notice and what solution can you suggest?
The distribution is highly dissymetric. We could use instead a geometric progression classification.

5

Try to make a more readable map.
Test the method = 'geom' and nclass = 9 parameters of the cartography::getBreaks function

6

In order to make the result even more meaningful, we will now make a smoothed map. You can use the compute_kde function below.

Information

Use here a 2km sigma and a 2km resolution and the quantile method (n = 12).

7

Try to improve the map resolution.

8

Would we have obtained the same smoothing for this department if we had kept the restaurants of the bordering departments? Why?
No. The borders would have been different.

Exercise 4 : Cartogram

Make a cartogram of department 31 at the level of intercommunalities in proportion to the number of restaurants (SIRENE data).

Information

Load the layer ‘epci_31.rds’ from previous exercises and the cartogram package.

Normal borders

Cartogram

Exercise 5 : Linemap

Make a linemap of department 31 at the level of municipalities in proportion to the number of restaurants (SIRENE data).

Information

Load the map layer ‘com_31.rds’ from previous exercises and the linemap package.
Use the two getgrid andlinemap functions of this package. The following parameters work: cellsize = 1750, k = 400 and threshold = 0.01.



reproducibility

R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252   
[3] LC_MONETARY=French_France.1252 LC_NUMERIC=C                  
[5] LC_TIME=French_France.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] linemap_0.1.0       cartogram_0.1.0     raster_2.6-7        maptools_0.9-2     
 [5] sp_1.2-5            spatstat_1.56-1     rpart_4.1-10        nlme_3.1-128       
 [9] spatstat.data_1.3-1 mapview_2.3.0       ggplot2_3.0.0       cartography_2.1.1  
[13] bindrcpp_0.2        dplyr_0.7.4         sf_0.6-3            knitr_1.20         
[17] stringr_1.3.1      

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.16         lattice_0.20-34      deldir_0.1-14        png_0.1-7           
 [5] class_7.3-14         assertthat_0.2.0     digest_0.6.15        foreach_1.4.3       
 [9] mime_0.5             R6_2.2.2             plyr_1.8.4           stats4_3.3.2        
[13] evaluate_0.10.1      e1071_1.6-8          tensor_1.5           unilur_0.4.0.9000   
[17] pillar_1.1.0         rlang_0.2.1          lazyeval_0.2.1       R.utils_2.5.0       
[21] R.oo_1.21.0          Matrix_1.2-7.1       goftest_1.1-1        rmarkdown_1.10.12   
[25] rgdal_1.2-11         webshot_0.5.0        foreign_0.8-67       htmlwidgets_1.2     
[29] polyclip_1.9-1       munsell_0.4.3        shiny_1.0.5          httpuv_1.3.5        
[33] pkgconfig_2.0.1      base64enc_0.1-3      mgcv_1.8-15          rgeos_0.3-24        
[37] htmltools_0.3.6      tibble_1.4.2         codetools_0.2-15     viridisLite_0.3.0   
[41] withr_2.1.2          R.methodsS3_1.7.1    grid_3.3.2           jsonlite_1.5        
[45] satellite_1.0.1      xtable_1.8-2         gtable_0.2.0         DBI_1.0.0           
[49] magrittr_1.5         units_0.6-0          scales_0.5.0.9000    stringi_1.1.7       
[53] leaflet_1.1.0        spatstat.utils_1.9-0 RColorBrewer_1.1-2   iterators_1.0.8     
[57] htmldeps_0.1.1       tools_3.3.2          gdalUtils_2.0.1.7    glue_1.2.0          
[61] crosstalk_1.0.0      abind_1.4-5          yaml_2.1.18          colorspace_1.3-2    
[65] classInt_0.1-24      bindr_0.1           

  1. In French, IRIS is an acronym of ‘aggregated units for statistical information’. Their target sizes are 2000 residents per basic unit.